This session is about the basics of stochastic actor-oriented models (SAOMs). We refer to the RSiena page https://www.stats.ox.ac.uk/~snijders/siena/ for further resources. Here we will simulate from, and then estimate, the model.
It is appropriate that we start by simulating the model, as Siena stands for
Simulation Investigation for Empirical Network Analysis
We will use functionality from the network packages sna
and network (see https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/Markdowns/CHDH-SNA-2.Rmd).
The main package for SAOMs is RSiena.
The first time you use them, install the packages (uncomment the install commands)
# install.packages("sna")
# install.packages("network")
# install.packages("RSiena")
Once packages are installed, load them
library("sna")
library("network")
library('RSiena')
The RSiena package automatically loads van de Bunt's Freshman dataset (see Data description). We will use time-points 3 and 4
?tmp3 # opens the help file with information about the dataset
class(tmp3) # both time-points are regular adjacency matrices
class(tmp4)
dim(tmp3) # 32 students by 32 students
n <- dim(tmp4)[1]
The original tie-variables had 6 levels but have here been dichotomised. It is good practice to check that ties are binary
table(tmp3,useNA = 'always')
table(tmp4,useNA = 'always')
RSiena handles missing data in estimation, but for the purposes of simulating and investigating the network, we replace missing values with zeros
tmp3[is.na(tmp3)] <- 0 # remove missing
tmp4[is.na(tmp4)] <- 0 # remove missing
Plotting the networks with nodes in the same places illustrates what the SAOM will try to model
par(mfrow = c(1,2))
coordin <- plot(as.network(tmp3), main='Time 3')
plot(as.network(tmp4),coord=coordin, main='Time 4')
Looking closely, we see that some arcs have been added and others removed
In this session, we will assume that we have one initial network \(X(t_0)=x_{obs}(t_0)\), at time \(t_0\) and that we want to say something about a network \(X(t_1)\), at time \(t_1\), \(t_0<t_1\).
We will only use two waves but because of the Markov property of the model, all ingredients extend without limitations to several waves.
To simulate (but also estimate) we need to execute, in turn, each of the functions

- sienaDependent - formats an object of class matrix to class sienaDependent
- sienaDataCreate - formats data to class siena
- getEffects - provides us with the effects we can use for the process
- sienaAlgorithmCreate - sets up simulation/estimation settings

After these steps, the model is simulated/estimated using siena07
The function sienaDependent formats and translates the
two adjacency matrices to a “sienaDependent”
object:
mynet1 <- sienaDependent(array(c(tmp3, tmp4), # the matrices in order
dim=c(32, 32,2))# are set as two slices in an array
)
For networks, sienaDependent takes the networks glued
together in an \(n \times n \times\)
(number of waves) array.
You can define, as dependent variables, one-mode networks, two-mode networks, or dependent monadic variables
Once you have defined your variables, these are combined into a
siena object using sienaDataCreate
mydata <- sienaDataCreate(mynet1)
The siena object adds all relevant information about the data
mydata
## Dependent variables: mynet1
## Number of observations: 2
##
## Nodeset Actors
## Number of nodes 32
##
## Dependent variable mynet1
## Type oneMode
## Observations 2
## Nodeset Actors
## Densities 0.15 0.18
To determine what effects are available for our combination of different types of data
myeff <- getEffects(mydata)
Assume a model where an actor \(i\), when they make a change, only cares about how many ties they have.
myeff <- includeEffects(myeff, recip,include=FALSE)# remove reciprocity which is DEFAULT
## [1] effectName include fix test initialValue
## [6] parm
## <0 rows> (or 0-length row.names)
myeff$initialValue[myeff$shortName=='Rate'] <- 5.7288
myeff$initialValue[myeff$shortName=='density'][1] <- -0.7349
For later reference, notice how the result of includeEffects is assigned back to myeff: the updated effects object must be stored
What does the rate \(\lambda = 5.7288\) mean?
Each time the network has changed (or an opportunity to change has passed), each actor draws a waiting time \(T_i \overset{i.i.d.}{\thicksim} Exp(\lambda)\).
The actor that gets to change is the one who waits for the shortest amount of time
waiting <- rexp(32, 5.7288)
hist(waiting)
winner <- which( waiting == min(waiting))
paste('The winner is actor: ', winner)
## [1] "The winner is actor: 29"
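As an aside (a quick base-R sketch, not part of the RSiena workflow): the minimum of 32 i.i.d. \(Exp(\lambda)\) waiting times is itself exponentially distributed with rate \(32\lambda\), and each actor is equally likely to win the race.

```r
set.seed(1) # reproducibility of this illustration only
n.races <- 10000
lambda <- 5.7288
# simulate many races between the 32 actors
winners <- replicate(n.races, which.min(rexp(32, lambda)))
min.times <- replicate(n.races, min(rexp(32, lambda)))
# each actor wins roughly 1/32 = 3.1% of the races
range(table(winners) / n.races)
# the mean shortest waiting time is close to 1/(32 * lambda)
c(mean(min.times), 1 / (32 * lambda))
```

This is why, between observations, change opportunities arrive faster when there are more actors, while each individual actor still expects about \(\lambda\) opportunities per unit of time.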
Consider the example of \(i=1\), deciding between creating a tie to \(j=2\) or \(j=4\), or breaking the existing tie to \(j=3\)
Micro-step
With our current model
\[ \Pr(X(1\leadsto 2))=\Pr(X(1\leadsto 4))=\frac{e^{\beta_d(1-2x_{12})}}{1+\sum_{j\neq 1} e^{\beta_d(1-2x_{1j})}}=\frac{\exp(-0.7349)}{1+\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)} \] and
\[ \Pr(X(1\leadsto 3))=\frac{e^{\beta_d(1-2x_{13})}}{1+\sum_{j\neq 1} e^{\beta_d(1-2x_{1j})}}=\frac{\exp(0.7349)}{1+\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)} \] The conditional probability of \(i=1\) creating the tie to \(j=2\) is thus
exp(-0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
## [1] 0.1185728
and the conditional probability of \(i=1\) deleting the tie to \(j=3\) is thus
exp(0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
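We can compute the whole probability vector for this micro-step in one go (a base-R sketch of the toy four-actor example above; the 'stay' option contributes the 1 in the denominator):

```r
beta.d <- -0.7349                        # density parameter
x1 <- c(0, 1, 0)                         # tie-variables from i=1 to j=2,3,4: only the tie to 3 exists
change <- beta.d * (1 - 2 * x1)          # change statistic for toggling each tie-variable
p.change <- exp(change) / (1 + sum(exp(change)))
p.stay <- 1 / (1 + sum(exp(change)))     # probability of changing nothing
round(c(p.change, p.stay), 4)
sum(p.change) + p.stay                   # the four options have total probability one
```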
The function sienaAlgorithmCreate determines the
simulation/estimation settings. Here we are going to simulate only (simOnly = TRUE), generating n3 = 100 networks, starting in \(X(t_3)\)
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',# name will be appended to output
cond = FALSE, # NOT conditioning on num. changes
useStdInits = FALSE,# we are changing some defaults
nsub = 0 ,# more about subphases in estimation
n3 = 100,# number of simulations (in Phase 3)
simOnly = TRUE)# we only want to simulate
We are ready to simulate!
The function siena07 is the main engine of RSiena and is
used to estimate all models (apart from hierarchical SAOMs)
sim_ans <- siena07( sim_model,# the name of our model
data = mydata,# all of our data - see above for what is in there
effects = myeff,# the effects we have chosen, including parameters
returnDeps = TRUE,# save simulated networks
batch=TRUE )# batch=FALSE opens a GUI
The networks are in sim_ans$sims but to help with
extracting and formatting them we read in the function
reshapeRSienaDeps
source("https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/data/reshapeRSienaDeps.R")
We apply it on sim_ans
mySimNets <- reshapeRSienaDeps(sim_ans,n)# n is the number of nodes (defined above)
The object mySimNets is a 100 by 32 by 32
array with 100 simulated adjacency matrices
Plot networks with the same layout as for \(X(t_3)\) above
par(mfrow=c(2,5),# want 2 by 5 panels
oma = c(0,1,0,0) + 0.1,# custom outer margins
mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
coord=coordin) # observed network
invisible(# need to suppress printing to console
apply(mySimNets[1:9,,],# for the first 9 networks in the array
1,# along the first dimension, apply the following function
function(x)
plot(as.network(x),coord=coordin) ) )
It is hard to spot any differences. Plot density of the networks
hist( gden(mySimNets ) )
abline( v = gden(tmp4), col='red')
If actors only care about how many ties they have, we predict the density at \(t_4\) correctly
How about the number of reciprocated ties, i.e. dyads with \(x_{ij}x_{ji}=1\)?
hist( dyad.census(mySimNets)[,1] ,xlim=c(9,50))
abline(v = dyad.census(tmp4)[1], col='red')
Clearly, if the only rule actors have is to care about the number of ties they have, this is not enough to explain why there are so many (46) reciprocal dyads
Simulate assuming that actors want to have reciprocated ties to others with the model with probabilities \(\Pr(X(i \leadsto j) |i \text{ makes decision})\): \[ p_{ij}(X,\beta)=\frac{e^{\beta_d (1-2x_{ij}) +\beta_r (1-2x_{ij})x_{ji} } }{ \sum_h e^{\beta_d (1-2x_{ih}) +\beta_r (1-2x_{ih})x_{hi} } } \]
with parameters \(\beta_d=-1.1046\) and \(\beta_r=1.2608\):
myeff <- includeEffects(myeff, recip,include=TRUE)# reinstate reciprocity which is DEFAULT
## effectName include fix test initialValue parm
## 1 reciprocity TRUE FALSE FALSE 0 0
myeff$initialValue[myeff$shortName =='recip'][1] <- 1.2608
# We also need to change the other parameter values
myeff$initialValue[myeff$shortName=='Rate'] <- 6.3477
myeff$initialValue[myeff$shortName=='density'][1] <- -1.1046
Log odds for a new graph

|  | \(x_{ji}=0\) | \(x_{ji}=1\) |
|---|---|---|
| \(x_{ij}=0\) | 0 | \(\beta_d\) |
| \(x_{ij}=1\) | \(\beta_d\) | \(\beta_d+\beta_r\) |
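A small sketch of the resulting micro-step probabilities for a hypothetical configuration (the tie-variables below are made up for illustration; x.1 records which of \(j=2,3,4\) already nominate \(i=1\)):

```r
beta.d <- -1.1046   # density
beta.r <- 1.2608    # reciprocity
x1  <- c(0, 1, 0)   # ties from i=1 to j=2,3,4: only the tie to 3 exists
x.1 <- c(1, 0, 0)   # ties from j=2,3,4 back to i=1: only 2 nominates 1
# change statistic: density contribution plus the reciprocity bonus/penalty
change <- beta.d * (1 - 2 * x1) + beta.r * (1 - 2 * x1) * x.1
p <- exp(change) / (1 + sum(exp(change)))
round(p, 4)
# the tie to j=2, which would be reciprocated, is more attractive than the tie to j=4
```

The ratio of the probabilities for \(j=2\) and \(j=4\) is exactly \(e^{\beta_r}\), as the log-odds table above suggests.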
Set up the algorithm
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 100,
simOnly = TRUE)
Run simulation
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Reshape and extract networks:
mySimNets <- reshapeRSienaDeps(sim_ans,n)
Plot, using the same code as before
par(mfrow=c(2,5),# want 2 by 5 panels
oma = c(0,1,0,0) + 0.1,# custom outer margins
mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
coord=coordin) # observed network
invisible(# need to suppress printing to console
apply(mySimNets[1:9,,],# for the first 9 networks in the array
1,# along the first dimension, apply the following function
function(x)
plot(as.network(x),coord=coordin) ) )
Compare the observed number of reciprocated dyads to the simulated number of reciprocated dyads
hist( dyad.census(mySimNets)[,1])
abline(v = dyad.census(tmp4)[1], col='red')
With an actor preference of 1.26 for having their ties reciprocated, we reproduce the observed reciprocity
What about the triad census? Check the first 9 simulated networks
triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks
Look at the distributions for transitive triads and dense triads
par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(8,40))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,24))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')
A preference for reciprocity is not enough to explain why reciprocated dyads hang together in triangles or why there are so many transitive triads
Add another positive parameter \(\beta_t\) for the preference for closing open triads.
myeff <- includeEffects(myeff, recip,include=TRUE) # add the effect reciprocated ties
myeff <- includeEffects(myeff, transTrip,include=TRUE)
myeff$initialValue[myeff$shortName=='Rate'] <- 7.0959
myeff$initialValue[myeff$shortName=='density'][1] <- -1.6468
myeff$initialValue[myeff$shortName=='recip'][1] <- 0.8932
myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.2772
Set up the algorithm
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 100,
simOnly = TRUE)
Run simulation
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Reshape and extract networks:
mySimNets <- reshapeRSienaDeps(sim_ans,n)
What about the triad census? Check the first 9 simulated networks
triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks
Look at the distributions for transitive triads and dense triads
par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(2,55))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,86))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')
Having preferences for reciprocated ties and transitive ties seems to explain 'most' of the structure
How did I pick the numbers?
Manually, you could tweak the parameter values, simulate, compare the simulated statistics with the observed statistics, and repeat until they match
Luckily, we have an algorithm (stochastic approximation) for automating this
We define the model in the same way as when we simulated data
myeff <- getEffects(mydata)# We already have our effects, but let's start from scratch
myeff <- includeEffects(myeff, recip,include=TRUE) # add the reciprocity effect (it is already included by DEFAULT though)
myeff <- includeEffects(myeff, transTrip,include=TRUE)# we want to estimate the transitivity effect
Set up the algorithm with default values
(siena07 will assume you want to estimate)
est_model <- sienaAlgorithmCreate(
projname = 'est_model',
n3 = 1000,# number of simulations in Phase 3 (default *is* 1000)
simOnly = FALSE,# default *is* FALSE
cond = FALSE,
useStdInits = FALSE)
Estimate!
est_ans <-siena07(est_model,# algorithm object
data = mydata, # the data object we created earlier
effects = myeff,# same effects as in simulation
returnDeps = TRUE,
batch=TRUE)
Estimation gives us an ANOVA table with

- Parameter (point) estimates
- Standard errors of estimates
- Convergence statistics
summary( est_ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## 1. rate basic rate parameter mynet1 7.0328 ( 0.9465 ) -0.0318
## 2. eval outdegree (density) -1.6409 ( 0.1302 ) 0.0058
## 3. eval reciprocity 0.8943 ( 0.2109 ) -0.0109
## 4. eval transitive triplets 0.2746 ( 0.0431 ) -0.0315
##
## Overall maximum convergence ratio: 0.0921
##
##
## Total of 2318 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.896 0.004 0.010 -0.009
## 0.032 0.017 -0.010 -0.004
## 0.048 -0.368 0.044 -0.001
## -0.225 -0.628 -0.134 0.002
##
## Derivative matrix of expected statistics X by parameters:
##
## 11.672 5.133 4.124 50.798
## 67.740 217.080 144.440 1192.800
## 16.458 71.368 81.662 441.285
## 191.972 592.138 461.720 4590.234
##
## Covariance matrix of X (correlations below diagonal):
##
## 127.313 112.444 78.146 808.183
## 0.542 338.506 249.928 2159.994
## 0.449 0.881 237.609 1755.613
## 0.554 0.908 0.881 16709.070
These estimates look very much like the numbers that I picked arbitrarily - what makes these better?
The observed statistics that we want to ‘hit’ on average are
est_ans$targets
## [1] 125 175 92 431
# the same as sim_ans$targets
The simulated statistics for the parameters from the estimation are
colMeans(est_ans$sf2)
## [,1] [,2] [,3] [,4]
## [1,] 124.641 175.107 91.832 426.925
# also provided in:
# est_ans$estMeans
The simulated statistics for the parameters from the simulation are
sim_ans$estMeans
## [1] 125.64592 174.72228 91.03856 429.86593
We can calculate within how many standard deviation units of the targets we are for each statistic \[ \frac{\bar{z}_k-z_{obs}}{sd(z_k)} \] The deviations using the estimates from estimation:
(colMeans(est_ans$sf2)-est_ans$targets)/sqrt(diag(var(est_ans$sf2[,1,])))
## [,1] [,2] [,3] [,4]
## [1,] -0.03181686 0.005815681 -0.01089877 -0.03152474
# Also provided in:
# est_ans$tstat
For estimates from simulation:
(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))
For both sets of parameters, the simulated statistics are on average within less than 0.1 standard deviation units of the observed statistics
As a quick illustration, let us see what happens when we set rate, density, and reciprocity to the estimated values
myeff$initialValue[myeff$shortName=='Rate'] <- est_ans$theta[1]
myeff$initialValue[myeff$shortName=='density'][1] <- est_ans$theta[2]
myeff$initialValue[myeff$shortName=='recip'][1] <-est_ans$theta[3]
but pick another value for transitivity:
myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.1
and then, simulate!
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 1000,
simOnly = TRUE)
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Calculate the scaled difference between the average statistics and the observed statistics again
(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))
The average simulated statistics:
sim_ans$estMeans
are not close to the observed
sim_ans$targets
Decreasing the transitivity parameter results in networks having too few transitive triads
In general, we say that the estimation has converged and that the estimates can be interpreted if all convergence t-ratios are less than 0.1 in absolute value and the overall maximum convergence ratio is less than 0.25
The aim of the estimation algorithm is to find estimates \(\hat{\theta}\), such that \[
\underbrace{E_{\hat{\theta}}\{ Z[X(t_1)] \mid
X(t_0)=x_{obs}(t_0)\}}_{\text{'average'
statistics}}=Z[x_{obs}(t_1)]
\] The stochastic approximation algorithm in siena07
solves this equation computationally in three phases: Phase 1 estimates how sensitive the expected statistics are to the parameters (the derivative matrix); Phase 2 iteratively updates the parameters in small steps (Robbins-Monro); Phase 3 simulates with the final parameters to check convergence and compute standard errors
You may notice that the difference that is minimised in this algorithm is not \(\bar{z}-z_{obs}\): only one network is simulated in each iteration - but it works (minimising the full average also works but is less efficient)
The second column in the ANOVA table contains the nominal standard errors, i.e. the approximate standard deviations of the estimators of the \(\theta\)’s. Typically, these are used for standard hypothesis tests:
For effect/parameter \(k\), test \[ H_0:\beta_k=\beta_{k,0}=0 \] against \[ H_1:\beta_k\neq 0 \] The test statistic \[ \frac{\hat{\beta}_k-\beta_{k,0} }{sd(\hat{\beta}_k)}=\frac{\hat{\beta}_k }{sd(\hat{\beta}_k)}\approx \frac{\hat{\beta}_k }{se(\hat{\beta}_k)}, \] is approximately normally distributed \(N(0,1)\), if \(H_0\) is true.
Given the number of assumptions and approximations we need to make, I would advise that we stick to testing at the nominal 5% level, and reject \(H_0\) when the test statistic is greater than \(2\) in absolute value.
In the simple model we estimated, let us test \(H_0:\beta_t=0\). The observed test statistic
est_ans$theta[4]/est_ans$se[4]
## [1] 6.370043
is considerably larger than 2, and we can reject the null-hypothesis that the true value of the transitivity parameter is \(0\).
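Equivalently (a quick sketch, using the value printed above), we can convert the test statistic to a two-sided p-value under the standard normal reference distribution:

```r
t.stat <- 6.370043             # est_ans$theta[4] / est_ans$se[4] from above
p.value <- 2 * pnorm(-abs(t.stat))
p.value                        # vanishingly small: reject H0 at any conventional level
```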
Intuitively, we could also test if we ‘need’ \(\beta_t\), by estimating the model with \(\beta_t=0\), and check the convergence t-statistic. You can do this yourself now!
There are many different types of covariates
| Type | RSiena type |
|---|---|
| Constant monadic covariates | coCovar |
| Changing monadic covariates | varCovar |
| Constant dyadic covariate | coDyadCovar |
| Changing dyadic covariate | varDyadCovar |
| Changing (covariate) network | sienaDependent |
The usual nomenclature for measurement scales, and the distinction between metric and non-metric variables, applies.
The adjacency matrices s501, s502, and
s503, for the so-called s-50 dataset, the network of 50
girls, are also available with RSiena. For a full
description of this dataset see ?s50 or http://www.stats.ox.ac.uk/~snijders/siena/s50_data.htm
For clarity, we rename the adjacency matrices
friend.data.w1 <- s501
friend.data.w2 <- s502
friend.data.w3 <- s503
Note that we have three (3) waves.
Among the many monadic covariates are ‘smoking’ and ‘drinking’:
drink <- s50a
smoke <- s50s
Here, each variable has its own \(n \times 3\) data matrix: one observation for each individual at each time-point
head(smoke)
We are going to test \[ H_0: \text{smokers have as many friends as non-smokers} \] Against an alternative hypothesis stating that there is a difference. We will interpret this as “sending as many ties” and “receiving as many ties” as non-smokers.
We will use smoking at \(t_0\) as our explanatory variable and define ‘smoking’ as a value of 2 or 3.
smoke1.matrix <- as.numeric( smoke[,1]>1 )
table(smoke1.matrix, useNA='always')
To tell RSiena how to use this variable, we format it
smoke1 <- coCovar( smoke1.matrix )
Print to screen to see how R has interpreted the variable
smoke1
Format the dependent network variable as before:
friendshipData <- array( c( friend.data.w1,
friend.data.w2,
friend.data.w3 ),
dim = c( 50, 50, 3 ) )
friendship <- sienaDependent(friendshipData)
Using sienaDataCreate, we give RSiena the
opportunity to figure out how the data are structured and what types of
effects can be calculated from it
s50data <- sienaDataCreate( friendship, smoke1)
Since we have both a network and a monadic covariate, we will have more effects available to us than previously
s50.effects <- getEffects(s50data)
For testing our hypothesis we want a statistic that corresponds to \[ v_i x_{ij} \] for each tie-variable, and where \(v_i\) is one or zero according to whether \(i\) is a smoker or not, respectively. This is the sender effect
s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
egoX,# the shortname here evokes that variable for 'ego' affects decision
interaction1 = "smoke1" # the variable we want to interact the DV with
)
Next, we want a statistic that corresponds to \[ v_j x_{ij} \] so that if the corresponding parameter is positive, then actors are more likely to form (maintain) ties to actors \(j\) that have \(v_j=1\), i.e. are smokers
s50.effects <- includeEffects( s50.effects,
altX,# the shortname here evokes that variable for 'alter' affects decision of ego
interaction1 = "smoke1" # the variable we want to interact the DV with
)
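To see what these two statistics measure, we can compute them by hand on a small made-up network (a self-contained sketch; x and v below are hypothetical, not the s50 data - note also that RSiena centres covariates by default, so the fitted target statistics use the centred v):

```r
# toy 4-actor adjacency matrix and binary covariate (hypothetical)
x <- matrix(c(0, 1, 0, 0,
              0, 0, 1, 0,
              1, 0, 0, 1,
              0, 0, 0, 0), 4, 4, byrow = TRUE)
v <- c(1, 0, 1, 0)                 # actors 1 and 3 are 'smokers'
ego.stat <- sum(v * rowSums(x))    # egoX: sum_i v_i times outdegree of i
alt.stat <- sum(v * colSums(x))    # altX: sum_j v_j times indegree of j
c(ego.stat, alt.stat)
```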
For the van de Bunt dataset, we saw that triadic closure was important. We can choose to include it because we believe that it is important, but also as a control for our hypothesised effects
s50.effects <- includeEffects(s50.effects,# we "add an effect" to s50.effects
transTrip,# shortname of the effect
include=TRUE)
Our specified model is
s50.effects
## effectName include fix test initialValue parm
## 1 constant friendship rate (period 1) TRUE FALSE FALSE 4.69604 0
## 2 constant friendship rate (period 2) TRUE FALSE FALSE 4.32885 0
## 3 outdegree (density) TRUE FALSE FALSE -1.46770 0
## 4 reciprocity TRUE FALSE FALSE 0.00000 0
## 5 transitive triplets TRUE FALSE FALSE 0.00000 0
## 6 smoke1 alter TRUE FALSE FALSE 0.00000 0
## 7 smoke1 ego TRUE FALSE FALSE 0.00000 0
Specify the simulation settings
s50algorithm.simple <- sienaAlgorithmCreate( projname = 's50_simple' )# We are only using defaults
Estimating the model with default settings is simply a call to siena07
s50.simple.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE )
Print the results to screen
summary( s50.simple.ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
##
## Rate parameters:
## 0.1 Rate parameter period 1 6.4725 ( 1.0404 )
## 0.2 Rate parameter period 2 5.2730 ( 0.8831 )
##
## Other parameters:
## 1. eval outdegree (density) -2.6889 ( 0.1229 ) -0.1023
## 2. eval reciprocity 2.4580 ( 0.2056 ) -0.0845
## 3. eval transitive triplets 0.6245 ( 0.0756 ) -0.0161
## 4. eval smoke1 alter -0.0157 ( 0.1734 ) 0.0570
## 5. eval smoke1 ego 0.0810 ( 0.1817 ) 0.0488
##
## Overall maximum convergence ratio: 0.1624
##
##
## Total of 2220 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.015 -0.016 -0.004 0.001 -0.001
## -0.614 0.042 -0.002 0.000 0.003
## -0.429 -0.158 0.006 -0.001 0.000
## 0.027 -0.002 -0.082 0.030 -0.007
## -0.032 0.068 -0.025 -0.208 0.033
##
## Derivative matrix of expected statistics X by parameters:
##
## 281.414 223.884 489.690 2.432 2.305
## 127.684 138.064 259.263 0.219 0.072
## 351.990 325.050 1169.890 15.901 11.704
## 4.308 4.487 18.212 41.115 20.689
## 2.631 -0.275 9.060 25.283 41.266
##
## Covariance matrix of X (correlations below diagonal):
##
## 466.522 402.905 1135.459 8.084 8.922
## 0.927 405.345 1104.583 7.262 7.054
## 0.805 0.841 4259.473 43.032 37.970
## 0.049 0.047 0.087 57.886 45.160
## 0.054 0.046 0.077 0.782 57.635
Are all convergence statistics less than 0.1 in absolute value? If so, we can test our hypothesis - do we reject our \(H_0\)?
To account for (test) possible assortativity on smoking, we add the homophily effect:
s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
egoXaltX,# the shortname
interaction1 = "smoke1" # the variable we want to interact the DV with
)
s50.hom.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE )
Print the results to screen
summary( s50.hom.ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
##
## Rate parameters:
## 0.1 Rate parameter period 1 6.4748 ( 1.1283 )
## 0.2 Rate parameter period 2 5.2921 ( 0.8601 )
##
## Other parameters:
## 1. eval outdegree (density) -2.6975 ( 0.1106 ) 0.0210
## 2. eval reciprocity 2.4441 ( 0.1885 ) 0.0395
## 3. eval transitive triplets 0.6113 ( 0.0786 ) 0.0436
## 4. eval smoke1 alter -0.0737 ( 0.1855 ) 0.0378
## 5. eval smoke1 ego -0.0158 ( 0.2037 ) 0.0296
## 6. eval smoke1 ego x smoke1 alter 0.6332 ( 0.3449 ) 0.0059
##
## Overall maximum convergence ratio: 0.0797
##
##
## Total of 2197 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.012 -0.011 -0.004 0.001 0.000 -0.003
## -0.537 0.036 -0.003 -0.004 0.003 0.001
## -0.422 -0.230 0.006 0.000 -0.002 -0.002
## 0.068 -0.116 0.022 0.034 -0.014 -0.012
## 0.011 0.082 -0.098 -0.368 0.041 -0.015
## -0.068 0.009 -0.088 -0.195 -0.213 0.119
##
## Derivative matrix of expected statistics X by parameters:
##
## 291.668 225.793 497.884 3.389 4.723 17.481
## 132.402 142.550 265.581 4.666 3.152 8.337
## 348.222 319.782 1149.917 12.115 13.868 28.577
## 10.932 11.249 25.251 46.809 31.274 11.297
## 5.451 2.805 16.632 29.704 42.730 9.994
## 18.161 16.561 43.225 11.069 11.323 13.880
##
## Covariance matrix of X (correlations below diagonal):
##
## 469.475 403.896 1138.588 10.821 8.497 36.436
## 0.929 402.607 1095.798 10.707 6.356 33.364
## 0.801 0.832 4304.684 25.151 15.073 99.800
## 0.062 0.067 0.048 64.005 53.121 18.962
## 0.049 0.040 0.029 0.837 62.923 18.636
## 0.379 0.375 0.343 0.534 0.530 19.663
If all convergence statistics are small enough in absolute value,
we can test our hypothesis, controlling for possible assortativity/homophily
What about homophily on smoking - do smokers befriend other smokers?
We are now going to use drinking alcohol as both a changing covariate and a dependent variable.
Since the alcohol variable is a dependent variable, we use
sienaDependent to format it (print to screen to get a quick
view of the information stored)
drinking <- sienaDependent( drink, type = "behavior" )
We now have a dataset with two dependent variables so we need to
create a new siena data object using sienaDataCreate:
s50data.infl <- sienaDataCreate( friendship, smoke1, drinking )
With an additional dependent variable, additional effects are available
s50.effects.infl <- getEffects(s50data.infl)
The dependent variable drinking can act as a covariate
on the network. We are going to use the ‘complicated’, five-parameter,
homophily parametrisation of Lomi and Snijders (2019):
s50.effects.infl <- includeEffects( s50.effects.infl,# again, 'adding' to the effects
egoX,# sender effect for alcohol
egoSqX,# non-linear sender effect
altX,# receiver effect for alcohol
altSqX,# non-linear receiver effect
diffSqX,# squared difference of alcohol ego and alcohol alter
interaction1 = "drinking" # DV works the same as an IV on another DV
)
## effectName include fix test initialValue parm
## 1 drinking alter TRUE FALSE FALSE 0 0
## 2 drinking squared alter TRUE FALSE FALSE 0 0
## 3 drinking ego TRUE FALSE FALSE 0 0
## 4 drinking squared ego TRUE FALSE FALSE 0 0
## 5 drinking diff. squared TRUE FALSE FALSE 0 0
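The last of these, diffSqX, corresponds (as I read the effect) to the statistic \(\sum_{i,j} x_{ij}(v_i-v_j)^2\), so a negative parameter expresses homophily: ties spanning a large difference in drinking are penalised. A self-contained sketch on made-up values:

```r
# toy 3-actor network and drinking scores (hypothetical)
x <- matrix(c(0, 1, 1,
              0, 0, 1,
              1, 0, 0), 3, 3, byrow = TRUE)
v <- c(1, 3, 2)
diff.sq <- outer(v, v, function(a, b) (a - b)^2)  # (v_i - v_j)^2 for all dyads
diffSq.stat <- sum(x * diff.sq)                   # summed over existing ties only
diffSq.stat
```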
The dependent variable drinking can act as a dependent
variable for other variables, including the other dependent
variables.
s50.effects.infl <- includeEffects( s50.effects.infl,# still augmenting effects structure
avAlt,# this is the shortname for "average alter"
name="drinking",# name: what is the dependent variable
interaction1 = "friendship" # the network is now "independent" variable
)
## effectName include fix test initialValue parm
## 1 drinking average alter TRUE FALSE FALSE 0 0
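Up to the centring that RSiena applies internally, the average-alter statistic for actor \(i\) is \(z_i\) times the mean behaviour of the actors \(i\) nominates. A self-contained sketch with made-up values:

```r
# toy network and behaviour scores (hypothetical)
x <- matrix(c(0, 1, 1,
              1, 0, 0,
              0, 0, 0), 3, 3, byrow = TRUE)
z <- c(2, 4, 1)
av.alt <- sapply(1:3, function(i) {
  alters <- which(x[i, ] == 1)
  if (length(alters) == 0) 0 else z[i] * mean(z[alters])  # isolates contribute 0
})
sum(av.alt)
```

A positive avAlt parameter then means that actors tend to adjust their drinking towards the average of their friends.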
We use the same social selection effects for smoking, the sender effect
s50.effects.infl <- includeEffects( s50.effects.infl,# we "add an effect" to our effects object
egoX,# the shortname here evokes that variable for 'ego' affects decision
interaction1 = "smoke1" # the variable we want to interact the DV with
)
and the receiver effect
s50.effects.infl <- includeEffects( s50.effects.infl,
altX,# the shortname here evokes that variable for 'alter' affects decision of ego
interaction1 = "smoke1" # the variable we want to interact the DV with
)
and the homophily effect
s50.effects.infl <- includeEffects( s50.effects.infl ,# we "add an effect" to our effects object
egoXaltX,# the shortname
interaction1 = "smoke1" # the variable we want to interact the DV with
)
We will use two (2) different transitivity effects
s50.effects.infl <- includeEffects( s50.effects.infl, transTrip, transRecTrip )
## effectName include fix test initialValue parm
## 1 transitive triplets TRUE FALSE FALSE 0 0
## 2 transitive recipr. triplets TRUE FALSE FALSE 0 0
Set up estimation settings:
s50.algo.infl<- sienaAlgorithmCreate( projname = 's50_infl' )
And we can estimate the model (it will take a little longer)
s50.est.infl <- siena07( s50.algo.infl,
data = s50data.infl,
effects = s50.effects.infl,
batch = TRUE,
returnDeps = TRUE )
We view the ANOVA table as before:
summary(s50.est.infl)
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## Network Dynamics
## 1. rate constant friendship rate (period 1) 6.3027 ( 0.9675 ) 0.0526
## 2. rate constant friendship rate (period 2) 5.0138 ( 0.8284 ) -0.0700
## 3. eval outdegree (density) -2.7672 ( 0.2373 ) 0.0550
## 4. eval reciprocity 2.8244 ( 0.3193 ) 0.0468
## 5. eval transitive triplets 0.8982 ( 0.1577 ) 0.0566
## 6. eval transitive recipr. triplets -0.5138 ( 0.2533 ) 0.0443
## 7. eval smoke1 alter 0.1159 ( 0.2352 ) 0.0060
## 8. eval smoke1 ego -0.0938 ( 0.2619 ) -0.0082
## 9. eval smoke1 ego x smoke1 alter 0.3110 ( 0.3684 ) 0.0172
## 10. eval drinking alter -0.0876 ( 0.1355 ) 0.0251
## 11. eval drinking squared alter -0.1345 ( 0.1369 ) 0.0207
## 12. eval drinking ego 0.0227 ( 0.1233 ) -0.0051
## 13. eval drinking squared ego 0.2088 ( 0.1159 ) 0.0328
## 14. eval drinking diff. squared -0.1079 ( 0.0532 ) 0.0410
##
## Behavior Dynamics
## 15. rate rate drinking (period 1) 1.3089 ( 0.3279 ) -0.0739
## 16. rate rate drinking (period 2) 1.7941 ( 0.4726 ) -0.0311
## 17. eval drinking linear shape 0.4065 ( 0.2180 ) -0.0628
## 18. eval drinking quadratic shape -0.5627 ( 0.3290 ) 0.0364
## 19. eval drinking average alter 1.2426 ( 0.8434 ) 0.0334
##
## Overall maximum convergence ratio: 0.1697
##
##
## Total of 3576 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.936 -0.080 0.022 -0.050 -0.022 0.038 -0.016 0.043 0.008 0.013 0.021 -0.008 -0.025 0.005 -0.018 -0.027 0.016 -0.041 0.119
## -0.100 0.686 0.012 -0.018 -0.028 0.023 -0.015 -0.009 0.010 0.005 0.011 0.009 -0.004 0.003 -0.001 -0.022 0.011 0.020 -0.074
## 0.094 0.060 0.056 -0.049 -0.021 0.028 0.006 0.017 0.002 0.003 -0.001 -0.006 -0.015 -0.004 0.000 0.000 -0.004 0.008 -0.016
## -0.161 -0.067 -0.645 0.102 0.030 -0.051 0.010 -0.011 -0.002 -0.009 -0.019 0.004 0.016 0.002 0.007 0.009 -0.002 0.000 0.000
## -0.146 -0.212 -0.550 0.589 0.025 -0.035 0.000 0.001 -0.006 -0.003 -0.003 -0.001 0.002 0.001 0.003 0.001 -0.001 -0.004 0.007
## 0.154 0.112 0.466 -0.632 -0.864 0.064 0.001 -0.002 0.009 0.005 0.000 0.001 -0.001 -0.002 -0.010 -0.002 0.005 -0.001 0.009
## -0.069 -0.077 0.112 0.138 0.004 0.013 0.055 -0.010 -0.002 -0.016 -0.012 0.003 0.003 0.000 0.005 0.009 0.000 0.000 0.003
## 0.168 -0.042 0.267 -0.135 0.016 -0.033 -0.155 0.069 -0.020 0.002 0.005 -0.015 -0.012 -0.001 -0.004 -0.007 -0.003 0.001 -0.004
## 0.022 0.034 0.018 -0.018 -0.102 0.100 -0.019 -0.207 0.136 -0.001 -0.010 0.002 0.000 0.003 -0.012 0.015 0.005 -0.016 0.040
## 0.101 0.045 0.108 -0.205 -0.148 0.132 -0.493 0.046 -0.018 0.018 0.004 -0.009 -0.002 0.000 -0.003 -0.010 -0.001 -0.001 0.007
## 0.158 0.099 -0.016 -0.426 -0.151 0.005 -0.383 0.134 -0.199 0.217 0.019 -0.001 -0.008 0.001 0.002 -0.009 0.000 0.003 -0.014
## -0.070 0.088 -0.201 0.111 -0.027 0.019 0.087 -0.473 0.044 -0.558 -0.086 0.015 0.003 0.001 0.005 0.009 0.002 -0.001 0.000
## -0.227 -0.039 -0.549 0.443 0.124 -0.038 0.101 -0.407 0.010 -0.119 -0.491 0.215 0.013 -0.001 -0.003 0.004 0.001 -0.002 0.008
## 0.098 0.065 -0.284 0.132 0.107 -0.164 0.031 -0.060 0.161 -0.044 0.073 0.095 -0.190 0.003 0.000 0.001 0.001 0.001 -0.001
## -0.056 -0.003 -0.004 0.064 0.059 -0.117 0.058 -0.042 -0.102 -0.066 0.052 0.130 -0.075 -0.009 0.107 0.014 -0.017 0.005 -0.013
## -0.059 -0.056 0.003 0.061 0.020 -0.020 0.083 -0.059 0.084 -0.154 -0.134 0.151 0.065 0.026 0.092 0.223 -0.030 0.003 0.002
## 0.075 0.060 -0.077 -0.035 -0.038 0.097 0.005 -0.047 0.057 -0.044 0.008 0.091 0.045 0.054 -0.233 -0.294 0.048 -0.036 0.092
## -0.128 0.073 0.106 -0.002 -0.083 -0.006 0.001 0.015 -0.129 -0.020 0.056 -0.026 -0.047 0.043 0.046 0.021 -0.506 0.108 -0.263
## 0.146 -0.105 -0.081 -0.002 0.050 0.041 0.014 -0.020 0.129 0.062 -0.121 0.002 0.083 -0.016 -0.046 0.005 0.498 -0.947 0.711
##
## Derivative matrix of expected statistics X by parameters:
##
## 11.043 0.000 1.378 0.841 8.235 6.411 -0.786 -0.985 -0.294 -3.048 0.400 -2.673 2.329 1.254 -0.218 0.000 -0.309 -0.661 -1.405
## 0.000 11.147 2.369 2.132 18.813 15.628 0.267 0.067 -0.298 0.343 5.626 -1.022 4.375 3.591 0.000 -0.217 -1.388 0.106 -0.235
## 43.154 26.463 307.573 232.038 647.425 484.535 -0.122 6.303 19.853 -23.508 401.192 -1.291 438.629 507.245 -0.048 -3.875 -0.547 -8.185 2.538
## 8.055 4.287 126.232 134.315 277.665 248.702 2.071 2.612 8.440 -3.127 170.401 2.308 168.681 186.354 -0.314 -1.716 -1.205 1.757 5.744
## 61.221 37.310 402.501 335.965 1455.709 1167.384 22.581 27.287 35.095 37.190 590.542 58.440 609.004 683.061 1.392 -6.073 0.560 1.175 6.531
## 17.017 9.622 189.495 188.328 706.351 640.589 7.561 9.332 14.558 4.649 282.330 16.453 276.293 334.699 1.878 -3.198 -2.603 -2.458 2.040
## 0.429 0.730 2.917 2.615 26.058 18.445 51.039 31.553 10.254 90.203 43.522 74.709 27.623 15.071 -0.678 1.757 5.086 1.464 -0.763
## 0.215 -0.147 8.635 5.710 33.137 23.239 31.733 45.909 11.240 68.536 40.500 79.920 52.990 32.196 0.648 -0.654 0.979 2.935 -0.239
## 5.269 2.159 20.865 17.246 63.473 49.286 9.713 10.822 13.829 15.371 41.663 16.276 43.721 9.839 -0.203 -1.009 -1.164 2.254 1.276
## -9.940 3.625 14.509 13.179 69.114 42.798 74.864 63.520 15.878 285.722 16.202 232.203 20.666 41.690 2.274 5.441 23.819 -2.587 -4.393
## 44.766 18.927 368.630 309.458 880.572 707.132 35.715 32.230 38.346 8.460 680.578 25.110 629.085 601.033 -4.630 -2.628 -7.289 14.177 22.861
## -6.501 7.238 46.705 30.093 138.157 83.930 68.038 78.406 20.503 237.429 76.698 314.991 72.613 42.505 -1.656 -2.605 1.682 8.449 1.765
## 70.947 34.647 382.199 257.174 787.913 559.658 24.515 42.249 35.595 -5.446 632.490 13.874 796.275 763.606 1.840 -6.056 -2.309 -19.763 -7.540
## 47.999 26.773 380.639 258.827 792.452 577.637 -6.747 11.075 8.779 -54.585 580.799 -16.526 658.041 1571.445 3.617 -7.002 -9.538 -63.395 -31.106
## 3.262 0.000 1.369 0.204 9.484 7.693 0.325 0.391 1.010 -3.494 9.167 -1.969 -0.103 22.069 14.687 0.000 9.221 -3.746 -2.189
## 0.000 0.265 -1.452 0.050 5.722 6.669 -0.043 -0.286 -0.854 0.979 -0.551 -0.824 -4.667 3.997 0.000 11.274 8.750 0.323 1.059
## -1.246 2.196 -10.081 -9.953 -22.400 -19.640 1.040 2.437 -0.211 -0.552 -17.347 11.206 -16.691 -25.958 7.321 6.472 54.839 11.283 11.261
## 5.013 4.601 6.097 6.080 52.952 42.383 -3.213 -2.342 1.713 -15.151 10.411 -4.814 16.871 -53.359 1.910 -3.621 1.353 140.885 94.870
## 1.390 2.188 8.491 8.027 32.087 26.484 -2.173 -1.843 0.560 -10.264 14.466 -6.479 14.517 -7.153 -0.517 -2.381 -6.133 50.646 44.849
##
## Covariance matrix of X (correlations below diagonal):
##
## 115.996 -9.899 79.752 53.744 271.508 206.030 -0.202 1.460 7.951 -33.269 109.393 -28.144 120.241 143.273 -1.905 -2.090 -7.082 5.909 4.499
## -0.099 85.946 48.014 29.242 144.802 106.792 3.251 3.436 1.844 19.224 68.152 22.241 70.378 83.111 1.309 -1.853 2.535 -1.694 -3.452
## 0.323 0.226 524.045 413.245 1407.661 1078.773 12.670 16.341 39.466 -0.101 730.984 25.819 764.078 882.547 -2.066 -4.805 -4.169 10.611 33.865
## 0.255 0.161 0.922 383.040 1202.045 983.011 11.892 12.455 31.476 9.591 586.427 18.028 594.380 670.009 -2.080 -2.274 -4.518 7.152 31.792
## 0.337 0.209 0.823 0.822 5583.581 4447.280 65.915 84.618 127.713 123.029 2109.839 185.829 2141.717 2493.800 -4.046 -4.514 -1.140 24.560 119.354
## 0.314 0.189 0.773 0.824 0.976 3716.927 49.669 61.176 97.342 84.427 1635.193 121.416 1619.715 1880.945 -4.689 -0.571 -0.495 17.956 102.543
## -0.002 0.040 0.064 0.070 0.101 0.094 75.756 59.041 19.238 143.100 77.583 132.755 57.233 52.285 2.265 1.753 8.471 -1.686 -2.279
## 0.016 0.044 0.085 0.076 0.135 0.119 0.806 70.841 20.163 125.094 73.843 141.214 77.325 58.147 1.497 -0.440 6.348 1.326 0.213
## 0.166 0.045 0.387 0.361 0.383 0.358 0.496 0.537 19.896 31.308 75.422 34.424 78.596 33.799 -0.377 -1.565 -1.087 1.093 4.016
## -0.135 0.091 0.000 0.021 0.072 0.061 0.718 0.649 0.307 523.687 17.678 450.789 3.005 36.024 5.045 5.393 24.569 -3.796 -14.738
## 0.268 0.194 0.841 0.790 0.744 0.707 0.235 0.231 0.446 0.020 1440.359 71.746 1337.457 1487.742 1.031 -4.798 -11.691 -1.172 46.373
## -0.110 0.101 0.047 0.039 0.105 0.084 0.641 0.706 0.325 0.828 0.080 565.428 46.365 89.319 9.958 2.527 38.205 2.973 -3.748
## 0.280 0.191 0.838 0.762 0.719 0.667 0.165 0.231 0.442 0.003 0.884 0.049 1587.958 1549.701 -12.394 -15.379 -15.374 28.905 59.752
## 0.221 0.149 0.640 0.568 0.554 0.512 0.100 0.115 0.126 0.026 0.650 0.062 0.645 3632.938 24.237 8.268 3.413 -80.817 -3.724
## -0.038 0.031 -0.020 -0.023 -0.012 -0.017 0.056 0.039 -0.018 0.048 0.006 0.091 -0.067 0.087 21.292 0.049 14.777 -0.913 -3.150
## -0.039 -0.041 -0.043 -0.024 -0.012 -0.002 0.041 -0.011 -0.071 0.048 -0.026 0.022 -0.078 0.028 0.002 24.210 13.244 -1.791 -0.952
## -0.070 0.029 -0.020 -0.025 -0.002 -0.001 0.104 0.081 -0.026 0.115 -0.033 0.172 -0.041 0.006 0.343 0.288 87.126 9.976 18.234
## 0.039 -0.013 0.033 0.026 0.023 0.021 -0.014 0.011 0.017 -0.012 -0.002 0.009 0.051 -0.095 -0.014 -0.026 0.076 198.574 132.932
## 0.032 -0.028 0.112 0.123 0.121 0.128 -0.020 0.002 0.068 -0.049 0.093 -0.012 0.114 -0.005 -0.052 -0.015 0.148 0.716 173.638
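The printed summary above reports three diagnostics: the overall maximum convergence ratio, the covariance matrix of the estimates, and the derivative and covariance matrices used internally by the Method of Moments. Rather than reading these off the console, you can inspect them directly from the fitted object. A minimal sketch, assuming the fit above was stored in a `sienaFit` object called `ans` (the object name, and the `myalgorithm`/`mydata`/`myeff` names in the re-run line, are not shown in the output and are assumptions here):

```r
# Overall maximum convergence ratio (0.1697 above);
# values below 0.25 are conventionally taken as good convergence
ans$tconv.max

# Per-parameter convergence t-ratios; all should be below 0.1 in absolute value
max(abs(ans$tconv))

# Standard errors are the square roots of the diagonal of the
# covariance matrix of the estimates printed above
sqrt(diag(ans$covtheta))

# If convergence is unsatisfactory, re-estimate starting from the
# previous answer instead of from scratch
# ans <- siena07(myalgorithm, data = mydata, effects = myeff, prevAns = ans)
```

Restarting with `prevAns` reuses the previous estimates (and, where possible, the fitted algorithm state) as starting values, which usually brings the convergence ratio down in one or two additional runs.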